Feature selection via binary simultaneous perturbation stochastic approximation
Authors
Abstract
Feature selection (FS) has become an indispensable task in dealing with today’s highly complex pattern recognition problems with massive numbers of features. In this study, we propose a new wrapper approach for FS based on binary simultaneous perturbation stochastic approximation (BSPSA). This pseudo-gradient descent stochastic algorithm starts with an initial feature vector and moves toward the optimal feature vector via successive iterations. In each iteration, the current feature vector’s individual components are perturbed simultaneously by random offsets from a qualified probability distribution. We present computational experiments on datasets with numbers of features ranging from a few dozen to thousands using three widely used classifiers as wrappers: nearest neighbor, decision tree, and linear support vector machine. We compare our methodology against the full set of features as well as a binary genetic algorithm and sequential FS methods using cross-validated classification error rate and AUC as the performance criteria. Our results indicate that features selected by BSPSA compare favorably to alternative methods in general and BSPSA can yield superior feature sets for datasets with tens of thousands of features by examining an extremely small fraction of the solution space. We are not aware of any other wrapper FS methods that are computationally feasible with good convergence properties for such large datasets.
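The iteration described in the abstract can be illustrated with a minimal SPSA-style sketch. It is not the authors' implementation: the function name bspsa_select, the gain values a and c, the rounding threshold of 0.5, the symmetric ±1 perturbation, and the toy loss are all assumptions made for illustration. In a real wrapper setting the loss would be a cross-validated classification error for the selected features.

```python
import random

def bspsa_select(loss, n_features, iterations=100, a=0.02, c=0.05, seed=0):
    """SPSA-style search over binary feature masks (illustrative sketch only).

    loss: callable taking a list of 0/1 flags and returning a number to minimize.
    a, c: step and perturbation gains; the defaults are assumptions, not tuned values.
    """
    rng = random.Random(seed)

    def clip(x):
        return min(1.0, max(0.0, x))

    def to_mask(weights):
        # Round continuous weights to a binary include/exclude decision.
        return [1 if x >= 0.5 else 0 for x in weights]

    w = [0.5] * n_features           # initial feature vector (assumed start point)
    best_mask = to_mask(w)
    best_loss = loss(best_mask)

    for _ in range(iterations):
        # Perturb every component at once with a random +1/-1 offset.
        delta = [rng.choice((-1, 1)) for _ in range(n_features)]
        plus = to_mask([clip(x + c * d) for x, d in zip(w, delta)])
        minus = to_mask([clip(x - c * d) for x, d in zip(w, delta)])

        # Two loss measurements give a pseudo-gradient for all components.
        diff = loss(plus) - loss(minus)
        w = [clip(x - a * diff / (2 * c * d)) for x, d in zip(w, delta)]

        # Track the best rounded solution seen so far (one extra evaluation).
        mask = to_mask(w)
        current = loss(mask)
        if current < best_loss:
            best_mask, best_loss = mask, current

    return best_mask, best_loss


# Hypothetical toy loss: fraction of flags that differ from a known target mask.
target = [1, 0, 1, 0, 0, 1]

def toy_loss(mask):
    return sum(m != t for m, t in zip(mask, target)) / len(target)

mask, value = bspsa_select(toy_loss, len(target), seed=1)
print(mask, value)
```

Each iteration needs only two loss evaluations to estimate the pseudo-gradient regardless of the number of features, which is what makes this family of methods attractive for very wide datasets. With plain rounding, the search can stall once every weight is farther than c from the 0.5 threshold, so a practical version would need more careful gain sequences than this sketch uses.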
Similar resources
Stochastic Optimization Algorithms for Support Vector Machines Classification
In this paper, we consider the problem of semi-supervised binary classification by Support Vector Machines (SVM). This problem is explored as an unconstrained and non-smooth optimization task when part of the available data is unlabelled. We apply non-smooth optimization techniques to classification where the objective function considered is non-convex and nondifferentiable and so difficult to ...
Robust parameter design optimization of simulation experiments using stochastic perturbation methods
Stochastic perturbation methods can be applied to problems for which either the objective function is represented analytically, or the objective function is the result of a simulation experiment. The Simultaneous Perturbation Stochastic Approximation (SPSA) method has the advantage over similar methods of requiring only 2 measurements at each iteration of the search. This feature makes SPSA att...
An Overview of the Simultaneous Perturbation Method for Efficient Optimization
Multivariate stochastic optimization plays a major role in the analysis and control of many engineering systems. In almost all real-world optimization problems, it is necessary to use a mathematical algorithm that iteratively seeks out the solution because an analytical (closed-form) solution is rarely available. In this spirit, the “simultaneous perturbation stochastic approximation (SPSA)” met...
Multivariate Optimizing Up and Down Design
Suppose we are interested in finding the optimal dose of two drugs (for example, Tylenol and Aspirin), that is, we are interested in determining the dose combination that maximizes the probability of patients’ success. We assume responses are binary, either failure or success, and that the treatments to be used in the study are selected from a lattice of combination drugs. We extend the univari...
Coarse-to-Fine Registration of Remote Sensing Optical Images using SIFT and SPSA Optimization
Sub-pixel accuracy is the vital requirement of remote sensing optical image registration. For this purpose, a coarse-to-fine registration algorithm is proposed to register the remote sensing optical images. The coarse registration operation is performed by the scale-invariant feature transform (SIFT) approach with an outlier removal method. The outliers are removed by the Random sample consensu...
Journal title:
- Pattern Recognition Letters
Volume 75, Issue
Pages -
Publication date 2016